

Contract N00014-76-C-0885


Systems for Large Data Bases, P. C. Lockemann and E. J. Neuhold (eds.)
North-Holland Publishing Company, 1976


A DEDUCTIVE  CAPABILITY  FOR  DATA  MANAGEMENT 



Charles Kellogg, System Development Corporation, Santa Monica, Calif.
Philip Klahr, System Development Corporation, Santa Monica, Calif.
Larry Travis, University of Wisconsin, Madison, Wisconsin


This paper examines some of the problems and issues involved in
designing a practical deductive inference processor to augment
a data management system, as well as some of the benefits that
can be expected from such an augmentation. A deductive
processor design is presented that incorporates new techniques
for selecting, from large collections of mostly irrelevant
general assertions and specific facts, the small number needed
for deriving an answer to a particular query.

In this paper we discuss some of the issues involved in adding a deductive
capability to a data management system, and we describe a specific approach towards
achieving this objective. Figure 1 illustrates the major components of our
deductive data management system:


• A language processor that translates user input into a formal
intermediate symbolism.

• A data management system* that retrieves specific facts (n-tuples
of data values) from a data base as required.

• A deductive processor that uses general assertions (i.e., premises
representing general rule-based knowledge about a data-base domain)
to derive implicit information from collections of explicit data
values.

• A control module that facilitates communication between the several
components of the system and directs interaction between the
deductive processor and the data management system during on-line
question answering.


Listed below are some of the benefits to be expected from adding a deductive
processor to a data management system:

• A deductive processor permits the extraction of information that is
not explicitly stored but that can be inferred by combining specific
facts in the data base with rule-based knowledge encoded in general
assertions. This augmentation of the information-retrieval function
can be especially important for very large data-base domains.


*In our prototype we use a relational data management system (see Codd (1970),
Date (1975)). The research described in this paper is an outgrowth and extension
of our earlier research on natural-language data management (see Kellogg et al.
(1971), Travis et al. (1973)).

Figure 1. Deductively Augmented Data Management

• A deductive processor allows a data management system's language to be
extended and adapted to the needs of particular users. Thus a user's
language can be uncoupled from the particular terms and categories used
in organizing a data file. This is essential if users are to be able
to use a system without having a thorough knowledge of its file structure.
For example, a user should be able to ask whether the maternal grandfather
of John Kennedy was richer than his paternal grandfather without
knowing the respective men's names or that the file is structured in
terms of net worth. Relatively powerful inferential mechanisms are
necessary to enable full use of such descriptive references.

• A deductive processor not only generates answers to specific queries
but also supplies evidence (lines of reasoning) for or against these
answers. In some cases, the system may supply one argument leading to
"yes" and another leading to "no" (indicating inconsistent information).
In the real world of unreliable reports and uncertain facts, this kind
of response will in many cases be much more useful to a user than simple
categorical answers.

• While deduction is itself a precise and strict process, deductive
arguments can use premises of differing degrees of plausibility. Since
the plausibility of a conclusion (answer) is a function of the
plausibility of the premises from which it is derived, deduction provides
a basis for using "soft" information in a computer-based system. The
important thing is that the system be able to show the user the evidence
for a conclusion as well as the conclusion itself. Deduction can also
be used to generate multiple distinct arguments for a conclusion,
thereby supplying additional evidence for the plausibility of its
answer.

• A deductive processor can, under certain circumstances, supply conditional
answers when specific direct answers are not possible; e.g., "Is Joe
Smith eligible for a pension?--Yes, if he has thirty years of continuous
service." In this case the deductive processor identifies a specific
fact about Joe Smith that it needs to complete an argument but that it
cannot find in the files to which it has access.

• A deductive processor can answer "what-if" and other kinds of high-level
queries that are difficult if not impossible for present-day data
management systems.

Each of these capabilities is currently demonstrable within our prototype
deductive data management system. Our primary concern in this paper is to
outline our deductive system and to give examples emphasizing the derivation
of implicit information.

Research on mechanizing deduction has been conducted primarily in the areas of
question-answering and theorem proving within the broader area of artificial
intelligence. Early question-answering systems such as SIR (Raphael (1964)),
PROTOSYNTHEX (Schwarcz et al. (1970)), and CONVERSE (Kellogg (1968, 1971)) relied
primarily on set-inclusion logic for their deductive capability. Elliott (1965)
developed structure-specific procedures (such as one for transitivity and symmetry)
for deriving new information from a file of specific facts. The inferential
capabilities in these systems were limited and used for special purposes.

With the development of "resolution" (Robinson (1965)), more sophisticated
theorem-proving techniques were incorporated into question-answering systems, most
notably in Green (1969) and in Minker et al. (1973). General statements formulated
in a first-order predicate-calculus symbolism could now be used to derive new
information. Deductive power increased considerably, but at the expense of increased
search space. This led to a host of resolution strategies (Chang and Lee (1973)).
Some recent approaches to deduction have offered alternatives to resolution; these
include procedure-oriented deductive systems, exemplified by PLANNER (Hewitt
(1971)), and natural-deduction systems, exemplified by Bledsoe (1974) and Nevins
(1974).

Our primary concern has been to design a deductive processor that will support
practical data management in realistic environments involving large files of
general and specific information.* We have concentrated on the problem of selecting
from such large files the few premises and facts that are relevant for a particular
required deduction. We have adopted some of the deductive techniques used in
question-answering systems and modified them to be more suitable for data
management; we have also introduced new "planning" techniques for premise selection.
These techniques are discussed below.

INFORMATION  STRUCTURES  TO  SUPPORT  DEDUCTIVE  DATA  MANAGEMENT 


Figure 2 illustrates the principal files and processors that constitute our
deductive system. Note that the deductive processor operates primarily on general
assertions in its construction of proofs. The data management system accesses and
retrieves specific facts when such facts are needed for proof completion. The
four files used by the deductive processor are the general assertion file, the
predicate connection graph, the variable substitution file, and the semantic advice
file. These files have been segmented for purposes of processing efficiency and
data organization.

General Assertion File. The deductive processor has access to a file of general
assertions, or premises. These premises are represented in a Skolemized,
quantifier-free form, as "primitive conditional" expressions. Primitive
conditionals are logical statements whose major connective is the implication sign.
On either side of this connective, groupings of literals may be combined
conjunctively or disjunctively. Each literal is an atomic formula (i.e., a predicate
and its arguments) or a negated atomic formula. The primitive conditional is a
canonical form for the first-order predicate calculus. This form facilitates
finding chains of deductively linked middle-term predicates, displaying inference
plans and evidence chains, and storing information in such a way that the
strategic or heuristic implications of the original formulation are not lost in the
system, as is often the case with other canonical forms (e.g., the conjunctive
normal form used in resolution).

A predicate occurrence is uniquely identified by specifying the premise in which
it occurs, the predicate name of which it is an occurrence, its ordinal position
in the premise, whether it is on the left or right of the main conditional,
whether it is negated or not, and whether it is a member of a conjunctive or
disjunctive set. For each predicate occurrence, the above information is represented
by a unique compact bit string (a single IBM 370 computer word in our current
implementation).
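The encoding just described can be illustrated with a short sketch. The paper does
not give the bit layout, so the field widths below are purely hypothetical; they are
chosen only so that the six items fit in one 32-bit word, the size of an IBM 370
word. The names Occurrence, pack, and unpack are likewise illustrative.

```python
from dataclasses import dataclass

# Hypothetical field widths (not from the paper). Together with three
# one-bit flags they sum to 32 bits, the size of one IBM 370 word.
PREMISE_BITS, PREDICATE_BITS, POSITION_BITS = 12, 10, 7


@dataclass(frozen=True)
class Occurrence:
    premise: int       # number of the premise containing the occurrence
    predicate: int     # numeric code standing for the predicate name
    position: int      # ordinal position within the premise
    on_right: bool     # True if right of the main implication sign
    negated: bool      # True if the literal is negated
    disjunctive: bool  # True if in a disjunctive set, False if conjunctive


def pack(o: Occurrence) -> int:
    """Pack an occurrence into a single 32-bit unsigned integer."""
    if not (0 <= o.premise < 1 << PREMISE_BITS
            and 0 <= o.predicate < 1 << PREDICATE_BITS
            and 0 <= o.position < 1 << POSITION_BITS):
        raise ValueError("field out of range for the assumed layout")
    word = o.premise
    word = (word << PREDICATE_BITS) | o.predicate
    word = (word << POSITION_BITS) | o.position
    for flag in (o.on_right, o.negated, o.disjunctive):
        word = (word << 1) | int(flag)
    return word


def unpack(word: int) -> Occurrence:
    """Inverse of pack(): recover the six fields from the word."""
    disjunctive = bool(word & 1)
    negated = bool((word >> 1) & 1)
    on_right = bool((word >> 2) & 1)
    word >>= 3
    position = word & ((1 << POSITION_BITS) - 1)
    word >>= POSITION_BITS
    predicate = word & ((1 << PREDICATE_BITS) - 1)
    premise = word >> PREDICATE_BITS
    return Occurrence(premise, predicate, position, on_right, negated, disjunctive)
```

Because every occurrence is a single integer, sets of occurrences can be stored and
compared cheaply, which is presumably the point of the compact representation.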

Predicate Connection Graph. The predicate connection graph is abstracted from the
information available in the premises. Nodes represent predicate occurrences.
Each edge between a pair of nodes represents a possible deductive interaction
between the predicate occurrences in the nodes. The predicate connection graph is
of key importance in our system. It represents explicitly and compactly a great
deal of detailed structural information about general assertions and their possible
interconnections. This information is in a form that can be quickly accessed and
scanned.


*For the related artificial-intelligence problem of efficiently using a very large
knowledge base, see McDermott (1975) and Fahlman (1975).

The predicate connection graph bears some resemblance to the graph proof procedures
of Kowalski (1975), Shostak (1976), and Yates et al. (1970). The important
distinction between our approach and these procedures is in the way the connection
graph is used. As we will see in the next section, the predicate connection graph is
used to develop possible proof plans. The graph is an abstraction of information
about premises and their deductive interactions and does not, by itself, construct
proofs. It is used as a planning tool. For a full discussion of the predicate
connection graph, its use in constructing proof plans, and the use of semantic
advice in restricting searches through it, see Klahr (1975).

Figure 3 illustrates a small set of premises and their representations in
primitive-conditional form. Figure 4 illustrates the predicate connection graph
for these premises. The solid lines in Figure 4 are u-arcs (these "unification"
arcs represent deductive interactions between different occurrences of the same
predicate). These arcs are computed when premises are first entered into the
system. Another kind of information in the predicate connection graph is the
deductive dependency link, which represents deductive dependency between predicate
occurrences within a single premise. Two of the four kinds of dependency links
are shown in Figure 4.
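As a rough illustration of how u-arcs might be computed when premises are entered,
the sketch below encodes the five premises of Figure 3 and links every right-side
occurrence of a predicate to every left-side occurrence of the same predicate. This
is a simplification made for the sketch: it matches on predicate name and side only
and does not check that the arguments unify, and it is not claimed to reproduce the
arcs drawn in Figure 4. The names PREMISES, occurrences, and u_arcs are
illustrative.

```python
from collections import defaultdict

# Premises of Figure 3 as (left-side literals, right-side literals).
# Each literal is (predicate, arguments). The & / v grouping is omitted
# because this sketch only looks at predicate names and sides.
PREMISES = {
    1: ([("H", ("x1", "x2")), ("W", ("x1", "x2"))], [("M", ("x1", "x2"))]),
    2: ([("M", ("x3", "x4"))], [("M", ("x4", "x3"))]),
    3: ([("G", ("x5",)), ("M", ("x5", "x6"))], [("G", ("x6",))]),
    4: ([("Loc", ("x7", "Greece")), ("Liv", ("x8", "x7"))], [("G", ("x8",))]),
    5: ([("M", ("x9", "x10")), ("Liv", ("x9", "x11"))], [("Liv", ("x10", "x11"))]),
}


def occurrences(premises):
    """Yield (label, predicate, side), with labels in the P.n.m style."""
    for n, (lhs, rhs) in sorted(premises.items()):
        m = 0
        for side, literals in (("left", lhs), ("right", rhs)):
            for pred, _args in literals:
                m += 1
                yield f"{pred}.{n}.{m}", pred, side


def u_arcs(premises):
    """Candidate u-arcs: a right-side occurrence of P may feed a left-side one."""
    by_pred = defaultdict(lambda: {"left": [], "right": []})
    for label, pred, side in occurrences(premises):
        by_pred[pred][side].append(label)
    return sorted(
        (src, dst)
        for sides in by_pred.values()
        for src in sides["right"]
        for dst in sides["left"]
    )
```

For example, the conclusion M.1.3 of premise 1 is linked to the antecedent M.3.2 of
premise 3, since a derived marriage fact can help establish that a spouse is Greek.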

Variable Substitution File. Another file, also separated out for purposes of
efficiency, is the variable substitution file. This file is also abstracted from
information in the premises. It consists of the substitutions for variables that
establish the unifications represented by the u-arcs in the predicate connection
graph. This file is used only during the verification process, when the
substitution lists for all the unifications in a proof are combined and checked for
consistency.
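The substitution attached to a single u-arc can be sketched as follows. The code
assumes flat argument lists (variables and constants only) and the variable naming
of Figure 3 (x1, x2, ...); Skolem functions, which Skolemized premises may contain,
are deliberately left out. The function names are illustrative.

```python
def is_var(term: str) -> bool:
    # Assumption of this sketch: variables are written x1, x2, ... as in Figure 3.
    return term.startswith("x") and term[1:].isdigit()


def walk(term, subst):
    """Follow bindings until reaching a constant or an unbound variable."""
    while is_var(term) and term in subst:
        term = subst[term]
    return term


def unify(args_a, args_b, subst=None):
    """Return a substitution unifying two flat argument lists, or None."""
    if len(args_a) != len(args_b):
        return None
    subst = dict(subst or {})
    for a, b in zip(args_a, args_b):
        a, b = walk(a, subst), walk(b, subst)
        if a == b:
            continue
        if is_var(a):
            subst[a] = b
        elif is_var(b):
            subst[b] = a
        else:
            return None  # two different constants cannot be unified
    return subst
```

Unifying M(x1,x2) with M(x5,x6), for instance, yields the bindings x1 -> x5 and
x2 -> x6; it is bindings of this kind that the file would hold for each u-arc.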

Semantic Advice File. Semantic advice can be of considerable aid in deductive
searching. It permits the specification and use of deductively significant
procedural semantic information specific to a particular domain of discourse.
Frequently, advice cannot be formulated directly within the logical symbolism of the
general assertions, even when a symbolism as rich as that of primitive conditionals
is used. Whenever such advice can be captured, it can very likely be put to good use
in simplifying and speeding up the deductive process. In the advice file, semantic
advice is formulated and stored as condition-action pairs. (The user may also
give problem-specific advice for any particular query.)

DEDUCTIVE PROCESSOR MODULES

Control. The control module provides the primary interface between a user's
symbolic input (queries, advice, and data), the several deductive processor
modules, and the data management system. For example, Control may locate semantic
advice in the advice file relevant to a specific input query. It may then call
the Middle-Term Chain Generator and the Proof Proposal Generator to create proof
proposals. Control will then invoke the Proof Proposal Verifier and the Data
Management System in sequence to verify and complete a proof. Finally, it will
call the Response Generator and display the answer and proof to the user.

Middle-Term Chain Generator. The predicate connection graph is used to find
chains of middle-term predicates that deductively link the assumption and goal
predicates of a query. The basic proof strategies of natural deduction,
proof-by-contradiction, and proof-by-cases are automatically incorporated into this
predicate chain generator (since the predicate dependency links reflected in the
predicate connection graph are of the different kinds needed for all of these
strategies; see Klahr (1975)). The chain-generation process may be visualized as
one of generating a series of expanding "wave fronts" from each assumption and
goal predicate. These wave fronts represent deductively significant possible paths
from each predicate. As the two wave fronts expand, intersections are taken to
determine when an assumption wave front impinges upon a goal wave front. When this
happens, the system has discovered the beginning of a proof plan.
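The wave-front picture corresponds closely to a bidirectional breadth-first search,
and a minimal sketch of that idea follows. It is a simplification in several
respects: nodes are predicate names rather than predicate occurrences, all links
are treated as undirected, and the different kinds of dependency links are ignored.
The small graph in the usage note is read off premises that appear later in the
paper (Figure 7) and is an assumption of the example, not the system's actual graph.

```python
def undirected(pairs):
    """Build a symmetric adjacency map from a list of links."""
    edges = {}
    for a, b in pairs:
        edges.setdefault(a, set()).add(b)
        edges.setdefault(b, set()).add(a)
    return edges


def join(parents, meet):
    """Stitch the two half-paths together at the node where the waves met."""
    chain, node = [], meet
    while node is not None:          # walk back to the assumption
        chain.append(node)
        node = parents[0][node]
    chain.reverse()
    node = parents[1][meet]
    while node is not None:          # walk forward to the goal
        chain.append(node)
        node = parents[1][node]
    return chain


def middle_term_chain(edges, assumption, goal):
    """Expand a wave from each end in turn; return a chain once they touch."""
    if assumption == goal:
        return [assumption]
    # parents[0] / parents[1] record how each wave reached every node.
    parents = [{assumption: None}, {goal: None}]
    fronts = [[assumption], [goal]]
    side = 0
    while fronts[0] and fronts[1]:
        next_front = []
        for node in fronts[side]:
            for nb in edges.get(node, ()):
                if nb in parents[side]:
                    continue
                parents[side][nb] = node
                if nb in parents[1 - side]:   # the two wave fronts intersect
                    return join(parents, nb)
                next_front.append(nb)
        fronts[side] = next_front
        side = 1 - side
    return None


LINKS = undirected([
    ("Brother", "Sibling"), ("Sister", "Sibling"), ("Sibling", "Relative"),
    ("Cousin", "Relative"), ("Relative", "Nepot"), ("Subord", "Nepot"),
    ("Nepot", "Friend"), ("Friend", "Info-flow"), ("Wife", "Info-flow"),
])
```

Calling middle_term_chain(LINKS, "Brother", "Info-flow") returns the chain
Brother, Sibling, Relative, Nepot, Friend, Info-flow. Expanding from both ends
keeps each wave small, which matters when most premises are irrelevant to the query.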



1. Husbands and wives are married to each other.

v(H(x1,x2), W(x1,x2)) => M(x1,x2)

2. Marriage is a symmetric relation.

M(x3,x4) => M(x4,x3)

3. Spouses of Greeks are Greek.

&(G(x5), M(x5,x6)) => G(x6)

4. People living in a place located in Greece are Greek.

&(Loc(x7,Greece), Liv(x8,x7)) => G(x8)

5. Spouses live in the same place.

&(M(x9,x10), Liv(x9,x11)) => Liv(x10,x11)

Figure 3. Sample Premise Set in Primitive-Conditional Form.


Figure 4. Predicate Connection Graph for the Sample Premise Set.
(P.n.m is the occurrence of the predicate P in premise n, position m.
Legend: unification arcs, implication links, conjunctive links.)

Proof Proposal Generator. For each middle-term chain produced, the Proof Proposal
Generator extracts the premises containing the predicate occurrences in the chain
and forms skeletal proof plans. These plans are later sent to the verifier, which
constructs full deductive detail for the proofs.

Proof Proposal Verifier. For each proof proposal, the variables and substitutions
for them in the proof structure are examined to determine whether there are any
blockages (variables taking on conflicting values). If verification is successful,
the Control processor examines the proof to see whether there are any remaining
subproblems that need support from the file of specific facts. If facts are needed,
the data management system is called to search for these facts to complete the
proof.

Response Generator. When the system completes a successful verification and
instantiation of a proof proposal, it outputs answers and, if desired, the
derivations on which they are based. When derivations are not complete, the system
may display conditional answers or partial derivations that, in many cases, will
provide clues to missing information that the user may be able to acquire from some
other source.

A BUSINESS INFORMATION EXAMPLE

As an example of how our system works, let us assume that we have the task of
maintaining as complete and accurate a picture as possible of the operation of a
large business organization. This will include the need to understand the various
factions, real decision-making loci within the organization, and real flows of
control and information, as opposed to publicly announced ones or the ones that
appear on a formal organization chart as in Figure 5. This chart is for a
fictional company that is a large distributing organization with three major line
divisions: Chemical, Drug, and Liquor.

We should stress that high-level questions such as "Is the Drug faction or the
Liquor faction in supremacy?" have to involve a considerable amount of human
interpretation. A user should not expect a computer system, even one capable of
sophisticated deduction, to generate categorical answers directly for such
questions. What a deductive data management system can do, however, is help to
collect and organize evidence for or against a general conclusion or working
hypothesis. It is important to note that an inference system may make use of both
certain and plausible information in the generation of arguments and in the display
of evidence. That is, the evidence may be strong or weak; the human interpreter must
judge which.

Let us suppose that a new man, Zembruski, has been appointed executive vice
president and head of the Chemical division. We know something about him but
this information is spotty and incomplete. The task is to work out deductive
connections that might provide evidence toward concluding whether his appointment
should be considered a victory for the Drug faction or the Liquor faction. For
purposes of simplicity, we will focus on one principal relationship that might
bear on this question. This is the notion of informal information flow among
individuals. We take this notion to cover the possible exchanges of information
between people who are friends, relatives, co-workers, spouses, etc. If we can
obtain information that will allow us to deduce various instances of information
flow, we might gain evidence, for example, that Zembruski has many more
informational contacts with executives in the Liquor division than he does with
those in the Drug division.

Suppose that Engler is vice president and head of Liquor. We can ask the question,
"Will there be information flow between Engler and Zembruski?" Notice that while
this is a yes/no question, we will not be satisfied with just a simple "yes" or
"no." We will want to have access to the facts and the general assertions used by
the inference mechanism in its derivation.

Figure 5. Organization Chart.

In Figures 6 and 7 we give an example set of specific facts and general assertions
that are pertinent to questions of information flow. Of course, in an actual
situation, there could be hundreds of additional general assertions in the system,
and thousands of additional specific facts.

One successful derivation that our system could come up with, given the assertions
and facts in Figures 6 and 7, is illustrated in Figure 8. When our system is given
a complex question, it is broken down into a set of assumption predicates and a
set of goal predicates. When both assumptions and goals exist, the Middle-Term
Chain Generator is invoked to find middle-term predicate occurrences linking
assumptions and goals. We will see an example of this process in the next example.
For the current query we have only a single goal, namely, to establish information
flow between Engler and Zembruski. In this situation the system will back up from
the goal statement as illustrated in Figure 8 and select premises that can
deductively lead to the establishment of the goal.


Each premise in Figure 8 represents an instance of a general assertion. The goal
statement is enclosed in a rectangle, as are the two specific facts obtained from
the data base. The vertical lines connecting the instances of predicates--for
example, the line connecting two instances of Sibling--are unification arcs that
are located by searching the predicate connection graph. It is within this graph
that the system finds the linkages that enable it to link Brother to Sibling,
Sibling to Relative, Relative to Nepotism, Nepotism to Friendship, and, finally,
Friendship to Information-flow.
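The chain of linkages described for Figure 8 can be retraced with a small backward
chainer. This is a generic goal-directed search written for illustration, not the
plan-based procedure of the paper: it uses only the premises of Figure 7 that the
derivation needs, splits disjunctive antecedents and conjunctive consequents into
separate Horn rules, and marks variables with a leading "?" (all assumptions of the
sketch). The two facts are facts 3 and 10 of Figure 6.

```python
import itertools

FACTS = {
    ("Line-sub", "Richard Z.", "Engler"),
    ("Brother", "Richard Z.", "Zembruski"),
}

# (head, body) pairs; terms starting with "?" are variables.
RULES = [
    (("Sibling", "?x", "?y"), [("Brother", "?x", "?y")]),    # premise 1, one disjunct
    (("Relative", "?x", "?y"), [("Sibling", "?x", "?y")]),   # premise 12, one disjunct
    (("Subord", "?x", "?y"), [("Line-sub", "?x", "?y")]),    # premise 5, one conjunct
    (("Nepot", "?y", "?z"),
     [("Subord", "?x", "?y"), ("Relative", "?x", "?z")]),    # premise 11
    (("Friend", "?x", "?y"), [("Nepot", "?x", "?y")]),       # premise 7
    (("Info-flow", "?x", "?y"), [("Friend", "?x", "?y")]),   # premise 3
]

_fresh = itertools.count()


def walk(t, s):
    while t.startswith("?") and t in s:
        t = s[t]
    return t


def unify(a, b, s):
    """Unify two atoms under substitution s; return an extended copy or None."""
    if a[0] != b[0] or len(a) != len(b):
        return None
    s = dict(s)
    for x, y in zip(a[1:], b[1:]):
        x, y = walk(x, s), walk(y, s)
        if x == y:
            continue
        if x.startswith("?"):
            s[x] = y
        elif y.startswith("?"):
            s[y] = x
        else:
            return None
    return s


def rename(atom, n):
    """Give rule variables a fresh suffix so separate rule uses do not clash."""
    return tuple(f"{t}_{n}" if t.startswith("?") else t for t in atom)


def prove(goals, s, depth=8):
    """Yield (substitution, proof steps) for each way of proving all goals."""
    if not goals:
        yield s, []
        return
    first, rest = goals[0], goals[1:]
    for fact in FACTS:
        s1 = unify(first, fact, s)
        if s1 is not None:
            for s2, steps in prove(rest, s1, depth):
                yield s2, [("fact", fact)] + steps
    if depth == 0:
        return
    for head, body in RULES:
        n = next(_fresh)
        s1 = unify(first, rename(head, n), s)
        if s1 is not None:
            subgoals = [rename(b, n) for b in body] + rest
            for s2, steps in prove(subgoals, s1, depth - 1):
                yield s2, [("rule", head[0])] + steps
```

Asking for next(prove([("Info-flow", "Engler", "Zembruski")], {})) backs up from the
goal through Friend, Nepot, Subord, Relative, and Sibling until only the two stored
facts remain, mirroring the structure of the derivation described above.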

The general assertions used in Figure 8 are not all strictly true. For example,
the premise concerning nepotism for relatives of subordinate employees is sometimes
true (for certain circumscribed contexts or situations), but is clearly not always
true. We anticipate many uses in our system of such plausible premises. Where
this is done, it is clearly important to be able perspicuously to display to a
user the lines of logical argument that are being followed by the system, so that
the user can evaluate the credibility of a conclusion drawn from such questionable
premises. Our system permits the discovery of alternative derivations for the
same conclusion, which may enhance the credibility of the conclusion.

We note here a possible use of semantic advice. The user could suggest the use of
particular premises or predicates that he feels may be appropriate to a particular
query. For the current query, he may feel the premise concerning nepotism for
relatives may be appropriate to establish information flow. The system would try

1. Zembruski is division head of Chemical: Head(Zembruski, Chemical)

2. Engler is division head of Liquor: Head(Engler, Liquor)

3. Richard Z. is a line subordinate of Engler: Line-sub(Richard Z., Engler)

4. King is a line subordinate of Engler: Line-sub(King, Engler)

5. MR-Aces is a bridge club: Bridge-club(MR-Aces)

6. Rita S. is a member of MR-Aces: Member(Rita S., MR-Aces)

7. Ann K. is a member of MR-Aces: Member(Ann K., MR-Aces)

8. Rita S. is the wife of Smythe: Wife(Rita S., Smythe)

9. Ann K. is the wife of King: Wife(Ann K., King)

10. Richard Z. is the brother of Zembruski: Brother(Richard Z., Zembruski)

Figure 6. Specific Facts.

1. Brothers and sisters are siblings.

∀x,y ( v(Brother(x,y), Sister(x,y)) => Sibling(x,y) )

2. Wives maintain information flow with their husbands.

∀x,y ( Wife(x,y) => Info-flow(x,y) )

3. Information flow runs between friends.

∀x,y ( Friend(x,y) => Info-flow(x,y) )

4. Every worker who is not a chairman of the board has a boss.

∀x ∃y ( &(Worker(x), ¬Chairman(x)) => Boss(y,x) )

5. Every line subordinate is a worker and a subordinate.

∀x,y ( Line-sub(x,y) => &(Worker(x), Subord(x,y)) )

6. Every staff subordinate is a worker and a subordinate.

∀x,y ( Staff-sub(x,y) => &(Worker(x), Subord(x,y)) )

7. If someone does a nepotistic favor for another, then they are friends.

∀x,y ( Nepot(x,y) => Friend(x,y) )

8. Staff subordinates have information flow with their superiors.

∀x,y ( Staff-sub(x,y) => Info-flow(x,y) )

9. Line subordinates have information flow with their superiors.

∀x,y ( Line-sub(x,y) => Info-flow(x,y) )

10. There is a boss who has information flow with every one of his subordinates.

∃x ∀y ( &(Boss(x,y), Subord(y,x)) => Info-flow(x,y) )

11. If someone is a subordinate of another, the superior may do a nepotistic
favor for a relative of the subordinate.

∀x,y,z ( &(Subord(x,y), Relative(x,z)) => Nepot(y,z) )

12. People who are cousins or siblings are relatives.

∀x,y ( v(Cousin(x,y), Sibling(x,y)) => Relative(x,y) )

13. Members of a bridge club maintain information flow with each other.

∀x,y,z ( &(Bridge-club(x), Member(y,x), Member(z,x)) => Info-flow(y,z) )

14. The subordinate relation is transitive.

∀x,y,z ( &(Subord(x,y), Subord(y,z)) => Subord(x,z) )

15. Information flow is transitive.

∀x,y,z ( &(Info-flow(x,y), Info-flow(y,z)) => Info-flow(x,z) )

16. Information flow is symmetric.

∀x,y ( Info-flow(x,y) => Info-flow(y,x) )

Figure 7. General Assertions.
to use that premise in its proof. Similarly, if the user felt the relation
"nepotism" is of key importance, the system would be alerted to link through
occurrences of that predicate when possible. Note also that such advice could
be placed in the advice file for general use (e.g., if goal is Info-flow, use
Nepotism as middle term, or in more symbolic form "Goal(Info-flow); (Nepotism)").
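One plausible reading of advice stored as condition-action pairs is sketched below.
The representation is hypothetical: the condition is taken to be a goal predicate
and the action a list of middle terms to prefer, and the advice is applied by simply
reordering candidate chains. How the actual system consumes advice is described in
Klahr (1975), not here. The predicate is written Nepot to match Figure 7.

```python
# Hypothetical encoding of advice as (condition, action) pairs:
# condition = goal predicate, action = middle terms to prefer.
ADVICE = [
    ("Info-flow", ["Nepot"]),   # "if goal is Info-flow, use Nepotism as middle term"
]


def preferred_terms(goal_predicate, advice=ADVICE):
    """Collect the middle terms recommended for a goal predicate."""
    terms = []
    for condition, action in advice:
        if condition == goal_predicate:
            terms.extend(action)
    return terms


def rank_chains(chains, goal_predicate, advice=ADVICE):
    """Order candidate chains so those using advised middle terms come first."""
    wanted = set(preferred_terms(goal_predicate, advice))
    # sorted() is stable, so chains with equal scores keep their original order.
    return sorted(chains, key=lambda chain: -len(wanted & set(chain)))
```

Given a short chain through Wife and a longer one through Nepot, rank_chains with
goal "Info-flow" moves the Nepot chain to the front; with a goal that has no advice
the order is left unchanged.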

To show another capability of the system, consider the following situation. We
ask the system "What if Zembruski were to appoint Smythe to his staff; would there
then be information flow between Engler and Zembruski?" Here we are suggesting
the use of a particular assumption. Using this assumption the system will try to
establish the same goal as in the earlier example.

A successful derivation for this query is shown in Figure 9. This proof uses
premises quite different from those in the first derivation. This second derivation
is more complex than the first. We can get the gist of the argument, however, if
we follow through a few of the implications of the boxed specific facts that have
been obtained from the data base. In order deductively to link Staff-sub(Smythe,
Zembruski) to the goal Info-flow(Engler, Zembruski), the system has to determine
that Smythe has a wife, Rita; that the MR-Aces is a bridge club; that Rita is a
member of that bridge club; that Ann K. (the wife of employee King) belongs to the
same bridge club, and hence, via a general assertion, has possible information
flow with Rita; and that King is a line subordinate of Engler. Together, these
relations deductively establish that there may indeed be information flow between
Engler and Zembruski, namely through the wives of two of their subordinates.
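The effect of the "what-if" assumption can be checked with a small sketch. It uses
plain forward chaining to a fixpoint, which is a different technique from the
plan-based search described in this paper, and it is restricted to premises 2, 8, 9,
13, 15, and 16 of Figure 7 together with the six facts of Figure 6 that the argument
above mentions. That restriction is an assumption of the sketch; with the full
premise set the goal is already derivable through the nepotism chain of the first
example.

```python
from itertools import product

FACTS = {
    ("Line-sub", "King", "Engler"),
    ("Bridge-club", "MR-Aces"),
    ("Member", "Rita S.", "MR-Aces"),
    ("Member", "Ann K.", "MR-Aces"),
    ("Wife", "Rita S.", "Smythe"),
    ("Wife", "Ann K.", "King"),
}


def info_flow_closure(facts):
    """Return all Info-flow pairs derivable from the given facts."""
    flow = set()
    # Premises 2, 8, 9: Wife, Staff-sub, and Line-sub each imply Info-flow.
    for fact in facts:
        if fact[0] in ("Wife", "Staff-sub", "Line-sub"):
            flow.add((fact[1], fact[2]))
    # Premise 13: members of the same bridge club have information flow.
    clubs = {f[1] for f in facts if f[0] == "Bridge-club"}
    members = [(f[1], f[2]) for f in facts if f[0] == "Member" and f[2] in clubs]
    for (a, club_a), (b, club_b) in product(members, repeat=2):
        if club_a == club_b:
            flow.add((a, b))
    # Premises 16 (symmetry) and 15 (transitivity), repeated until nothing changes.
    changed = True
    while changed:
        new = {(b, a) for a, b in flow}
        new |= {(a, d) for (a, b), (c, d) in product(flow, repeat=2) if b == c}
        changed = not new <= flow
        flow |= new
    return flow


ASSUMPTION = ("Staff-sub", "Smythe", "Zembruski")
GOAL = ("Engler", "Zembruski")
```

Under these restrictions GOAL is absent from info_flow_closure(FACTS) but present in
info_flow_closure(FACTS | {ASSUMPTION}), so the hypothetical appointment is exactly
what connects the two executives.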

We can use this example to show the basic operation of the system. The system is
given an assumption and a goal. The Middle-Term Chain Generator attempts to find
a chain of predicate occurrences that deductively link assumption to goal via the
premises. The predicate connection graph, which contains information on the
deductive connections between premises, is used in this chain-generation process.

If the chain generator is successful and produces a chain, the Proof Proposal
Generator extracts the set of premises containing the occurrences in the chain,
and the system has the beginning of a proof plan. In Figure 9, the set of premises
on the right were formed as a result of a chain linking the assumption to the
goal via occurrences of the predicate Info-flow.

The Proof Proposal Generator then examines the set of premises to determine whether
subproblems remain. In Figure 9, four subproblems were formed and are resolved
using the four premises on the left. Subproblems resulting from these four premises
are specified as needing fact-file support. (We have previously indicated to the
system that certain predicates should be left for data-base search, either because
we have complete knowledge about certain predicates such as Line-sub in an
organization or because the information can be easily determined by the user, such
as the wives of employees. In the latter case, the system would essentially be
giving the user a "conditional answer," leaving certain subproblems open for user
completion.)

Once all subproblems have been deductively resolved or left for fact-file support,
the Proof Proposal Verifier combines the substitutions of all the unifications in
the proof to check for consistency. Inconsistency occurs if a variable is required
to take on two different constant values simultaneously. If verification is
successful, the data management system is invoked to locate the specific facts
needed for proof completion. In Figure 9, the six facts shown complete the proof.
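The consistency check can be sketched as a merge of binding lists. The sketch
assumes that each unification contributes a list of (variable, term) pairs, that
variables are marked with a leading "?", and that terms are variables or constants
only; these conventions and the name combine are illustrative, not taken from the
system.

```python
def is_var(term):
    # Assumption of this sketch: variable names start with "?".
    return term.startswith("?")


def combine(substitution_lists):
    """Merge several lists of (variable, term) bindings.

    Returns the combined substitution as a dict, or None if some variable
    would have to take two different constant values (a blockage).
    """
    merged = {}

    def walk(term):
        while is_var(term) and term in merged:
            term = merged[term]
        return term

    for bindings in substitution_lists:
        for var, term in bindings:
            a, b = walk(var), walk(term)
            if a == b:
                continue
            if is_var(a):
                merged[a] = b
            elif is_var(b):
                merged[b] = a
            else:
                return None   # two different constants: inconsistent proposal
    return merged
```

For example, combining [("?x", "?y")], [("?y", "King")], and [("?x", "King")]
succeeds, while replacing the last binding with ("?x", "Engler") returns None
because ?x would need to be both King and Engler.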

While we have suggested the importance of displaying evidence for (or against) a
deduced answer, we can readily see that derivations such as the one illustrated in
Figure 9 may become difficult to follow. It is important to display
machine-generated logical arguments in as perspicuous and user-oriented a form as
possible. We are currently developing techniques to display English-like
formulations for such proofs.

SUMMARY


We have argued that inference mechanisms can significantly enhance the power and
usability of a data management system. They enable the compact storage of a large
amount of information in the form of general assertions, and they enable the
combination of these assertions, with explicitly stored specific facts, to deduce
other specific facts that would otherwise not be available.

Perhaps just as important, a deductive capability in a data management system can
enable extendability of user language. The general assertions used by the
deductive mechanisms can definitionally connect the concepts used by the data
management system for organizing its data base to different concepts more
appropriate for a particular user community.

We  have  briefly  described  a deductive  system  specifically  designed  to  provide 
inferential  capability  for  a data  management  system.  From  various  files  containing 
information  extracted  from  general  assertions,  the  system  generates  middle-term 
chains,  which  it  combines  into  proof  proposals.  These  proposals  are  then  used  in 
the  generation  of  data-base  search  requests  for  concrete  facts,  which,  in  turn, 
transform  proposals  into  complete  proofs  and  answers. 

Applying deduction to practical question-answering in realistic environments
requires special attention to the previously unsolved problem of efficiently
selecting, from very large files of specific facts and general assertions, the
very few that are relevant for a particular deduction. Our approach to this
selection problem involves constructing abstract proof plans and then iteratively
fleshing them out with more and more detail. Particular facts and assertions are
selected for trial only when they fit into some plan. Semantic advice, i.e.,
advice specific to a particular subject domain, can be used to guide the
construction and articulation of proof plans. On the basis of our experiments so
far, the approach looks promising.